92 research outputs found
Embedding Feature Selection for Large-scale Hierarchical Classification
Large-scale Hierarchical Classification (HC) involves datasets consisting of
thousands of classes and millions of training instances with high-dimensional
features posing several big data challenges. Feature selection that aims to
select the subset of discriminant features is an effective strategy to deal
with large-scale HC problem. It speeds up the training process, reduces the
prediction time and minimizes the memory requirements by compressing the total
size of learned model weight vectors. Majority of the studies have also shown
feature selection to be competent and successful in improving the
classification accuracy by removing irrelevant features. In this work, we
investigate various filter-based feature selection methods for dimensionality
reduction to solve the large-scale HC problem. Our experimental evaluation on
text and image datasets with varying distribution of features, classes and
instances shows upto 3x order of speed-up on massive datasets and upto 45% less
memory requirements for storing the weight vectors of learned model without any
significant loss (improvement for some datasets) in the classification
accuracy. Source Code: https://cs.gmu.edu/~mlbio/featureselection.Comment: IEEE International Conference on Big Data (IEEE BigData 2016
ALE: Additive Latent Effect Models for Grade Prediction
The past decade has seen a growth in the development and deployment of educational technologies for assisting college-going students in choosing majors, selecting courses and acquiring feedback based on past academic performance. Grade prediction methods seek to estimate a grade that a student may achieve in a course that she may take in the future (e.g., next term). Accurate and timely prediction of students' academic grades is important for developing effective degree planners and early warning systems, and ultimately improving educational outcomes. Existing grade prediction methods mostly focus on modeling the knowledge components associated with each course and student, and often overlook other factors such as the difficulty of each knowledge component, course instructors, student interest, capabilities and effort.
In this paper, we propose additive latent effect models that incorporate these factors to predict the student next-term grades. Specifically, the proposed models take into account four factors: (i) student's academic level, (ii) course instructors, (iii) student global latent factor, and (iv) latent knowledge factors. We compared the new models with several state-of-the-art methods on students of various characteristics (e.g., whether a student transferred in or not). The experimental results demonstrate that the proposed methods significantly outperform the baselines on grade prediction problem. Moreover, we perform a thorough analysis on the importance of different factors and how these factors can practically assist students in course selection, and finally improve their academic performance
- …